Abstract

This paper studies a class of incentive schemes based on intervention, where there exists an
intervention device that is able to monitor the actions of users and to take an action that affects
the payoffs of users. We consider the case of perfect monitoring, where the intervention device
can immediately observe the actions of users without errors. We also assume that there exist
actions of the intervention device that are most and least preferred by all the users and the
intervention device, regardless of the actions of users. We derive analytical results about the
outcomes achievable with intervention, and illustrate our results with an example based on the
Cournot model.

1 Introduction

This paper studies incentive schemes that drive self-interested users toward the system objective.
The operation of networks by non-cooperative, self-interested users generally leads to suboptimal
performance [1]. As a result, different forms of incentive schemes to improve performance have
been investigated in the literature. One form of incentive scheme widely studied in economics and
engineering is pricing (or, more generally, transfer of utilities) [2]. Pricing can induce efficient use
of network resources by aligning private incentives with social objectives. Although pricing has a
solid theoretical foundation, implementing a pricing scheme can be impractical or cumbersome in
some cases. Let us consider a wireless Internet service as an example. A service provider can limit
access to its network resources by charging an access fee. However, charging an access fee requires
a secure and reliable method to process payments, which creates a burden on both users and
service providers. The issue of allocative fairness also arises when a service provider charges
for Internet service. In the presence of the income effect, uniform pricing will bias the allocation
of network resources towards users with high incomes. Because the Internet can play the role of
an information equalizer, it has been argued in public policy debate that access to the Internet
should be provided as a public good by a public authority rather than as a private good in a market.

Another method to provide incentives is to use repeated interaction [4]. Repeated interaction
can encourage cooperative behavior by adjusting future payoffs depending on current behavior. A
repeated game strategy can form the basis of an incentive scheme in which the monitoring and
punishment burden is decentralized to users (see, for example, [5]). However, implementing a
repeated game strategy requires repeated interaction among users, which may not be available.
For example, the set of users interacting in a mobile network changes frequently by nature.

In this paper, we study an alternative form of incentive scheme based on intervention, which
was proposed in our previous work [6]. In an incentive scheme based on intervention, a network
is augmented with an intervention device that is able to monitor the actions of users and to take
an action that affects the payoffs of users. Intervention directly affects the network usage of users,
unlike pricing, which uses an outside instrument to affect the payoffs of users. Thus, an incentive
scheme based on intervention can provide an effective and robust method to provide incentives
in that users cannot avoid intervention as long as they use network resources. Moreover, it does
not require a long-term relationship among users, which makes it applicable to networks with a
dynamically changing user population.

As a first step toward the study of incentive schemes based on intervention, we focus in this paper 
on the case of perfect monitoring, where the intervention device can immediately observe the actions 
chosen by users without errors. We derive analytical results assuming that there exist actions of the 
intervention device that are most and least preferred by all the users and the intervention device, 
regardless of the actions of users. We then illustrate our results with an example based on the 
Cournot model. 

2 Model 

We consider a network where $N$ users and an intervention device interact. The set of the users
is denoted by $\mathcal{N} = \{1, \ldots, N\}$. The action space of user $i$ is denoted by $A_i$, and the action
of user $i$ is denoted by $a_i \in A_i$, for all $i \in \mathcal{N}$. An action profile is represented by a vector
$a = (a_1, \ldots, a_N) \in A \triangleq \prod_{i \in \mathcal{N}} A_i$. An action profile of the users other than user $i$ is written as
$a_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N)$ so that $a$ can be expressed as $a = (a_i, a_{-i})$. The intervention
device observes the actions chosen by the users immediately, and then it chooses its own action.
The action space of the intervention device is denoted by $A_0$, and its action is denoted by $a_0 \in A_0$.
For convenience, we sometimes call the intervention device user 0. The set of the users and the
intervention device is denoted by $\mathcal{N}_0 = \mathcal{N} \cup \{0\}$.

The actions of the intervention device and the users jointly determine their payoffs. The payoff
function of user $i \in \mathcal{N}_0$ is denoted by $u_i : A_0 \times A \to \mathbb{R}$. That is, $u_i(a_0, a)$ represents the payoff
that user $i$ receives when the intervention device chooses action $a_0$ and the users choose an action
profile $a$. In particular, the payoff of the intervention device, $u_0(a_0, a)$, can be interpreted as the
system objective. Since the intervention device can choose its action knowing the actions chosen
by the users, a strategy for it can be represented by a function $f : A \to A_0$, which is called an
intervention rule. The set of all possible intervention rules is denoted by $\mathcal{F}$.

Suppose that there is a network manager who determines the intervention rule used by the
intervention device. We assume that the manager can commit to an intervention rule, for example,
by using a protocol embedded in the intervention device. The game played by the manager and
the users is called an intervention game. The sequence of events in an intervention game can be
listed as follows.

1. The manager chooses an intervention rule $f \in \mathcal{F}$.

2. The users choose their actions $a \in A$, knowing the intervention rule $f$ chosen by the manager.

3. The intervention device observes the action profile $a \in A$ and takes an action $a_0 = f(a) \in A_0$.

The payoff function of user $i \in \mathcal{N}_0$, provided that the manager has chosen an intervention rule
$f$, is given by $v_i^f : A \to \mathbb{R}$, where

$v_i^f(a) = u_i(f(a), a)$. (1)

An intervention rule $f$ induces a simultaneous game played by the users, whose normal form
representation is given by

$\Gamma_f = \langle \mathcal{N}, (A_i)_{i \in \mathcal{N}}, (v_i^f)_{i \in \mathcal{N}} \rangle$. (2)

We can predict actions chosen by the users given an intervention rule $f$ by applying the solution
concept of Nash equilibrium to the induced game $\Gamma_f$.

Definition 1. An intervention rule $f \in \mathcal{F}$ sustains an action profile $a^* \in A$ if $a^*$ is a Nash
equilibrium of the game $\Gamma_f$, i.e.,

$v_i^f(a^*) \ge v_i^f(a_i, a^*_{-i})$ for all $a_i \in A_i$, for all $i \in \mathcal{N}$. (3)

An action profile $a^*$ is sustainable if there exists an intervention rule $f$ that sustains $a^*$.
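
As an illustration only, the check in Definition 1 can be sketched in Python for finite action spaces. The finite discretization, the function name `sustains`, and the toy payoff table `BASE` (a prisoner's-dilemma-style game with a device action in {0, 1} that subtracts 5 from each user's payoff) are assumptions made for this sketch and are not part of the model above.

```python
def sustains(f, u, action_spaces, a_star, tol=1e-12):
    """Check inequality (3): is a_star a Nash equilibrium of the game induced by f?

    f             -- intervention rule, maps an action profile (tuple) to a device action
    u             -- u(i, a0, a): payoff of user i (users are numbered 1..N)
    action_spaces -- list of finite action lists, one per user (assumed finite here)
    """
    a_star = tuple(a_star)
    for idx, A_i in enumerate(action_spaces):
        i = idx + 1  # user index; 0 is reserved for the intervention device
        on_path = u(i, f(a_star), a_star)
        for a_i in A_i:
            dev = a_star[:idx] + (a_i,) + a_star[idx + 1:]
            if u(i, f(dev), dev) > on_path + tol:
                return False  # user i has a profitable unilateral deviation
    return True


# Hypothetical toy game: action 0 = cooperate, 1 = defect.
BASE = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}


def u(i, a0, a):
    # Device action a0 in {0, 1}; a0 = 1 lowers every user's payoff by 5.
    return BASE[a][i - 1] - 5 * a0


def no_intervention(a):
    return 0


def punish_unless_cooperate(a):
    return 0 if a == (0, 0) else 1


spaces = [[0, 1], [0, 1]]
print(sustains(no_intervention, u, spaces, (0, 0)))          # False
print(sustains(punish_unless_cooperate, u, spaces, (0, 0)))  # True
```

In this toy game, mutual cooperation is not a Nash equilibrium without intervention, but it is sustained once any deviation triggers the payoff-reducing device action.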

Let $\mathcal{E}(f) \subseteq A$ be the set of action profiles sustained by $f$. Then the set of all sustainable action
profiles is given by $\mathcal{E} = \cup_{f \in \mathcal{F}} \mathcal{E}(f)$. A pair of an intervention rule $f$ and an action profile $a$ is said to
be attainable if $f$ sustains $a$. The manager's problem is to find an attainable pair that maximizes
the payoff of the intervention device among all attainable pairs.

Definition 2. $(f^*, a^*) \in \mathcal{F} \times A$ is an intervention equilibrium if $a^* \in \mathcal{E}(f^*)$ and

$v_0^{f^*}(a^*) \ge v_0^f(a)$ (4)

for all $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. $f^* \in \mathcal{F}$ is an optimal intervention rule if there exists an
action profile $a^* \in A$ such that $(f^*, a^*)$ is an intervention equilibrium.

Intervention equilibrium is a solution concept for intervention games, based on a backward
induction argument. An intervention equilibrium can be considered as a subgame perfect equilibrium
applied to an intervention game, since the induced game is a subgame of an intervention game.
It is implicitly assumed that the manager can induce the users to choose the best Nash equilibrium
for the system in case of multiple Nash equilibria. One possible explanation for this is that the
manager recommends to the users an action profile sustained by the intervention rule he chooses
so that the action profile becomes a focal point [7]. The manager's problem of finding an optimal
intervention rule can be expressed as

$\max_{f \in \mathcal{F}} \max_{a \in \mathcal{E}(f)} v_0^f(a)$. (5)

3 Analytical Results 

In this section, we derive analytical results about sustainable action profiles and intervention
equilibria, imposing the following assumption.

Assumption 1. There exist $\underline{a}_0, \overline{a}_0 \in A_0$ such that for all $i \in \mathcal{N}_0$,

$u_i(\underline{a}_0, a) \ge u_i(a_0, a) \ge u_i(\overline{a}_0, a)$ for all $a_0 \in A_0$, for all $a \in A$. (6)

$\underline{a}_0$ and $\overline{a}_0$ can be interpreted as the minimal and maximal intervention actions of the intervention
device, respectively. For given $a \in A$, the users and the intervention device receive the highest (resp.
lowest) payoff when the intervention device takes the minimal (resp. maximal) intervention action.
This allows the intervention device to reward or punish all the users at the same time.

We first characterize the set of sustainable action profiles, $\mathcal{E}$. The following class of intervention
rules is useful to characterize $\mathcal{E}$.

Definition 3. $f_{\tilde{a}} : A \to A_0$ is an extreme intervention rule with target action profile $\tilde{a} \in A$ if

$f_{\tilde{a}}(a) = \underline{a}_0$ if $a = \tilde{a}$, and $f_{\tilde{a}}(a) = \overline{a}_0$ otherwise. (7)

Note that an extreme intervention rule uses only the two extreme points of $A_0$. With an extreme
intervention rule, the intervention device chooses the most preferred action for the users when they
follow the target action profile while choosing the least preferred action when they deviate. Hence,
an extreme intervention rule provides the strongest incentive for sustaining a given target action
profile, which leads us to the following lemma.

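Lemma 1. If $a^* \in \mathcal{E}$, then $a^* \in \mathcal{E}(f_{a^*})$.

Proof. Suppose that $a^* \in \mathcal{E}$. Then there exists an intervention rule $f$ such that $u_i(f(a^*), a^*) \ge
u_i(f(a_i, a^*_{-i}), a_i, a^*_{-i})$ for all $a_i \in A_i$, for all $i \in \mathcal{N}$. Then we obtain $u_i(\underline{a}_0, a^*) \ge u_i(f(a^*), a^*) \ge
u_i(f(a_i, a^*_{-i}), a_i, a^*_{-i}) \ge u_i(\overline{a}_0, a_i, a^*_{-i})$ for all $a_i \in A_i$, for all $i \in \mathcal{N}$, where the first and the third
inequalities follow from (6). □

Let $\mathcal{F}^e$ be the set of all extreme intervention rules, i.e., $\mathcal{F}^e = \{f_{\tilde{a}} \in \mathcal{F} : \tilde{a} \in A\}$. Also, define
$\mathcal{E}^e = \cup_{f \in \mathcal{F}^e} \mathcal{E}(f) = \{a \in A : \exists f \in \mathcal{F}^e \text{ such that } f \text{ sustains } a\}$. By applying Lemma 1, we can
obtain the following results.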






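Theorem 1. (i) $a^* \in \mathcal{E}^e$ if and only if $u_i(\underline{a}_0, a^*) \ge u_i(\overline{a}_0, a_i, a^*_{-i})$ for all $a_i \in A_i$, for all $i \in \mathcal{N}$.
(ii) $\mathcal{E} = \mathcal{E}^e$.

(iii) If $(f^*, a^*)$ is an intervention equilibrium, then $(f_{a^*}, a^*)$ is also an intervention equilibrium.

Proof. (i) Suppose that $u_i(\underline{a}_0, a^*) \ge u_i(\overline{a}_0, a_i, a^*_{-i})$ for all $a_i \in A_i$, for all $i \in \mathcal{N}$. Then $f_{a^*}$ sustains
$a^*$, and thus $a^* \in \mathcal{E}^e$. The converse follows from Lemma 1.

(ii) $\mathcal{E} \supseteq \mathcal{E}^e$ follows from $\mathcal{F} \supseteq \mathcal{F}^e$, while $\mathcal{E} \subseteq \mathcal{E}^e$ follows from Lemma 1.

(iii) Suppose that $(f^*, a^*)$ is an intervention equilibrium. Then by Definition 2, $f^*$ sustains $a^*$,
and $v_0^{f^*}(a^*) \ge v_0^f(a)$ for all $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. Since $a^* \in \mathcal{E}$, $a^* \in \mathcal{E}(f_{a^*})$ by Lemma
1. Hence, $v_0^{f^*}(a^*) \ge u_0(f_{a^*}(a^*), a^*)$. On the other hand, since $f_{a^*}(a^*) = \underline{a}_0$, we have $v_0^{f^*}(a^*) \le
u_0(f_{a^*}(a^*), a^*)$ by (6). Therefore, $v_0^{f^*}(a^*) = u_0(f_{a^*}(a^*), a^*)$, and thus $u_0(f_{a^*}(a^*), a^*) \ge v_0^f(a)$ for
all $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. This proves that $(f_{a^*}, a^*)$ is an intervention equilibrium. □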

Theorem 1 shows that there is no loss of generality in three senses when we restrict attention 
to extreme intervention rules. First, in order to test whether there exists an intervention rule that 
sustains a given action profile, it suffices to consider only the extreme intervention rule having the 
action profile as its target action profile. Second, the set of action profiles that can be sustained by 
an intervention rule remains the same when we consider only extreme intervention rules. Third, if 
there exists an optimal intervention rule, we can find an optimal intervention rule among extreme 
intervention rules. 

Note that the role of extreme intervention rules is analogous to that of trigger strategies in
repeated games with perfect monitoring. To generate the set of equilibrium payoffs, it suffices to
consider trigger strategies that trigger the most severe punishment in case of a deviation. Under
Assumption 1, the maximal intervention action $\overline{a}_0$ plays a similar role to mutual minmaxing [4] in
that it provides the strongest threat to deter a deviation. The next theorem provides a necessary
and sufficient condition under which an extreme intervention rule together with its target action
profile constitutes an intervention equilibrium.

Theorem 2. $(f_{a^*}, a^*)$ is an intervention equilibrium if and only if $a^* \in \mathcal{E}$ and $u_0(\underline{a}_0, a^*) \ge u_0(\underline{a}_0, a)$
for all $a \in \mathcal{E}$.

Proof. Suppose that $(f_{a^*}, a^*)$ is an intervention equilibrium. Then $f_{a^*}$ sustains $a^*$, and thus $a^* \in \mathcal{E}$.
Also, $u_0(f_{a^*}(a^*), a^*) \ge v_0^f(a)$ for all $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. Choose any $a \in \mathcal{E}$. Then
by Lemma 1, $f_a$ sustains $a$, and thus $u_0(\underline{a}_0, a^*) = u_0(f_{a^*}(a^*), a^*) \ge u_0(f_a(a), a) = u_0(\underline{a}_0, a)$.

Suppose that $a^* \in \mathcal{E}$ and $u_0(\underline{a}_0, a^*) \ge u_0(\underline{a}_0, a)$ for all $a \in \mathcal{E}$. To prove that $(f_{a^*}, a^*)$ is an
intervention equilibrium, we need to show (i) $f_{a^*}$ sustains $a^*$, and (ii) $u_0(f_{a^*}(a^*), a^*) \ge v_0^f(a)$ for
all $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. Since $a^* \in \mathcal{E}$, (i) follows from Lemma 1. To prove (ii), choose
any $(f, a) \in \mathcal{F} \times A$ such that $a \in \mathcal{E}(f)$. Then $u_0(f_{a^*}(a^*), a^*) = u_0(\underline{a}_0, a^*) \ge u_0(\underline{a}_0, a) \ge v_0^f(a)$,
where the first inequality follows from $a \in \mathcal{E}$. □

Theorem 2 implies that if we obtain an action profile $a^*$ such that $a^* \in \arg\max_{a \in \mathcal{E}} u_0(\underline{a}_0, a)$,
we can use it to construct an intervention equilibrium and thus an optimal intervention rule.
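
The recipe suggested by Theorems 1 and 2 can be sketched in Python for finite action spaces: first collect the profiles that pass the test of Theorem 1(i), then pick the one that maximizes the device payoff under the minimal intervention action. The finite action spaces, the function names, and the toy payoff table `BASE` are assumptions introduced only for this illustration.

```python
from itertools import product


def sustainable_set(u, action_spaces, a0_min, a0_max):
    """Profiles a with u_i(a0_min, a) >= u_i(a0_max, a_i', a_-i) for all users i and all a_i'."""
    E = []
    for a in product(*action_spaces):
        ok = all(
            u(idx + 1, a0_min, a) >= u(idx + 1, a0_max, a[:idx] + (a_i,) + a[idx + 1:])
            for idx, A_i in enumerate(action_spaces)
            for a_i in A_i
        )
        if ok:
            E.append(a)
    return E


def best_target(u, action_spaces, a0_min, a0_max):
    """Target profile of an optimal extreme intervention rule, or None if E is empty."""
    E = sustainable_set(u, action_spaces, a0_min, a0_max)
    return max(E, key=lambda a: u(0, a0_min, a)) if E else None


# Hypothetical toy game: action 0 = cooperate, 1 = defect; device action a0 in {0, 1}.
BASE = {(0, 0): (3, 3), (0, 1): (0, 4), (1, 0): (4, 0), (1, 1): (1, 1)}


def u(i, a0, a):
    if i == 0:  # device payoff: sum of user payoffs (the system objective)
        return sum(BASE[a]) - 10 * a0
    return BASE[a][i - 1] - 5 * a0


spaces = [[0, 1], [0, 1]]
print(sustainable_set(u, spaces, a0_min=0, a0_max=0))  # [(1, 1)]
print(best_target(u, spaces, a0_min=0, a0_max=1))      # (0, 0)
```

When the device has no punishment available (`a0_max=0`), only the Nash equilibrium of the underlying game is sustainable; with the punishing action available, every profile of this toy game is sustainable and the welfare-maximizing one is selected.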
Figure 1: Contour lines of social welfare in the Cournot duopoly game.

4 Illustrative Example 

In this section, we discuss an example to illustrate the results in Section 3. Consider a wireless
network with two users and an intervention device interfering with each other. The action of user
$i$ is its usage level, where $A_i = [0, \overline{a}_i]$ for $i = 0, 1, 2$. $\overline{a}_i$ is the maximum usage level of user $i$. The
total usage level is given by $a_0 + a_1 + a_2$. The quality of service is determined by the total usage
level, following the relationship

$Q(a_0, a_1, a_2) = [q - b(a_0 + a_1 + a_2)]^+$, (8)

where $q, b > 0$ and $[x]^+ = \max\{x, 0\}$. The payoff of user $i \in \{1, 2\}$ is given by the product of the
quality received and its usage level,

$u_i(a_0, a_1, a_2) = Q(a_0, a_1, a_2) a_i$. (9)

The system objective is given by social welfare, which is defined as the sum of the payoffs of the
users,

$u_0(a_0, a_1, a_2) = u_1(a_0, a_1, a_2) + u_2(a_0, a_1, a_2)$. (10)

Note that if there is no intervention device (i.e., if $a_0$ is held fixed at 0), the example is identical to the
Cournot duopoly model with a linear demand function and zero production cost. The corresponding
Cournot duopoly game achieves the symmetric social optimum at $a_1 = a_2 = a_L \triangleq q/4b$ while it
has the unique Cournot-Nash equilibrium at $a_1 = a_2 = a_H \triangleq q/3b$, as depicted in Figure 1. Hence,
the goal of the manager is to improve upon the inefficient outcome $(a_H, a_H)$ by introducing the
intervention device in the network.
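
The two benchmark usage levels can be checked numerically. The following Python sketch assumes the parameter values $q = 12$ and $b = 1$, no intervention ($a_0 = 0$), and usage caps large enough that they do not bind; the symmetric best-response iteration is just one convenient way to locate the Cournot-Nash point and is not taken from the text.

```python
q, b = 12.0, 1.0  # assumed parameter values for illustration


def welfare(a1, a2):
    """Social welfare (10) with a0 = 0: quality times total usage."""
    return max(q - b * (a1 + a2), 0.0) * (a1 + a2)


def best_response(a_other):
    """Maximizer of (q - b*(a_i + a_other)) * a_i over a_i >= 0."""
    return max(0.0, (q - b * a_other) / (2 * b))


# Symmetric best-response iteration; the map is a contraction with factor 1/2.
a = 0.0
for _ in range(200):
    a = best_response(a)
a_H = a

# Welfare (q - b*T)*T is maximized at total usage T = q/(2b), split equally.
a_L = q / (4 * b)

print(a_H, q / (3 * b))
print(a_L, welfare(a_L, a_L), welfare(a_H, a_H))
```

With these values the iteration settles at $a_H = 4$, while $a_L = 3$ yields welfare 36 against 32 at the Cournot-Nash point.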

Given the structure of the intervention game in this example, the capability of the intervention
device is determined by its maximum intervention level $\overline{a}_0$. In the following, we investigate
sustainable action profiles and those that constitute an intervention equilibrium as we vary $\overline{a}_0$.

Proposition 1. (i) If $\overline{a}_0 = 0$, then $\mathcal{E} = \{(a_H, a_H)\}$.
(ii) If $\overline{a}_0 \ge q/b$, then $\mathcal{E} = A$.

(iii) If $\overline{a}_0 \ge (3\sqrt{2} - 4)q/4\sqrt{2}b$, then $(a_L, a_L) \in \mathcal{E}$ and thus $(a_L, a_L)$ constitutes an intervention
equilibrium.

If the intervention device cannot affect the payoffs of the users ($\overline{a}_0 = 0$), the non-cooperative
outcome $(a_H, a_H)$ is the only sustainable action profile that is consistent with the self-interest of
the users. On the other hand, if the intervention device can apply a sufficiently high intervention
level ($\overline{a}_0 \ge q/b$), it has the ability to degrade the quality to zero no matter what action profile
the users choose. Since the payoffs of the users are non-negative, the punishment from using $\overline{a}_0$ is
strong enough to make every action profile sustainable. We can also find a condition on $\overline{a}_0$ that
enables $f_{(a_L, a_L)}$ to sustain the symmetric social optimum $(a_L, a_L)$. With $\overline{a}_0 \ge (3\sqrt{2} - 4)q/4\sqrt{2}b$,
$(a_L, a_L)$ is sustainable and thus $(f_{(a_L, a_L)}, (a_L, a_L))$ is an intervention equilibrium by Theorem 2.
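
The threshold on the maximum intervention level can be recovered from the test in Theorem 1(i). The following is a sketch of that calculation, assuming that the usage cap of each user is large enough for the best deviation to be interior and that the maximum intervention level is at most $3q/4b$ (for larger values the deviation payoff is zero and the condition holds trivially).

```latex
\begin{align*}
u_i(0, a_L, a_L) &= \Bigl(q - \frac{q}{2}\Bigr)\frac{q}{4b} = \frac{q^2}{8b}, \\
\max_{a_i} u_i(\overline{a}_0, a_i, a_L)
  &= \max_{a_i} \Bigl(\frac{3q}{4} - b\,\overline{a}_0 - b\,a_i\Bigr) a_i
   = \frac{(3q/4 - b\,\overline{a}_0)^2}{4b}, \\
\frac{(3q/4 - b\,\overline{a}_0)^2}{4b} \le \frac{q^2}{8b}
  &\iff \overline{a}_0 \ge \Bigl(\frac{3}{4} - \frac{1}{\sqrt{2}}\Bigr)\frac{q}{b}
   = \frac{(3\sqrt{2} - 4)\,q}{4\sqrt{2}\,b}.
\end{align*}
```

The first line is the payoff at the target under the minimal intervention level 0, and the second is the best payoff a single user can obtain by deviating when the device responds with its maximum intervention level.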

Figure 2 plots the set $\mathcal{E}$ for six different values of $\overline{a}_0$ with parameters $q = 12$, $b = 1$, and $\overline{a}_1 =
\overline{a}_2 = 12$. We can see that $\mathcal{E}$ expands as $\overline{a}_0$ increases, starting from a single point $(a_H, a_H) = (4, 4)$
when $\overline{a}_0 = 0$ to the entire space $A$ when $\overline{a}_0 \ge q/b = 12$. When $\overline{a}_0 < (3\sqrt{2} - 4)q/4\sqrt{2}b \approx 0.51$, only
the action profile that is closest to $(a_L, a_L) = (3, 3)$ among those in $\mathcal{E}$ constitutes an intervention
equilibrium. When $\overline{a}_0 \ge (3\sqrt{2} - 4)q/4\sqrt{2}b \approx 0.51$, the action profiles in $\mathcal{E}$ that satisfy $a_1 + a_2 =
2a_L = 6$ constitute an intervention equilibrium, as all of them yield the maximum social welfare.
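
Membership in the sustainable set for this example can be tested point by point with the condition of Theorem 1(i). The Python sketch below assumes the parameter values quoted above ($q = 12$, $b = 1$, usage caps of 12), takes the minimal intervention level to be 0, and uses a closed-form best deviation; the function name `in_E` and the numerical tolerance are choices made for this illustration.

```python
import math


def in_E(a1, a2, a0_max, q=12.0, b=1.0, a_bar=12.0, tol=1e-9):
    """Theorem 1(i) test for the target (a1, a2) with maximum intervention level a0_max."""

    def best_deviation_payoff(a_other):
        # Best payoff against the opponent's usage when the device applies a0_max.
        r = q - b * (a0_max + a_other)   # residual quality before own usage
        if r <= 0:
            return 0.0                   # quality is already zero
        a_i = min(r / (2 * b), a_bar)    # concave objective, capped at a_bar
        return (r - b * a_i) * a_i

    quality = max(q - b * (a1 + a2), 0.0)  # quality at the target with a0 = 0
    return (quality * a1 >= best_deviation_payoff(a2) - tol
            and quality * a2 >= best_deviation_payoff(a1) - tol)


threshold = (3 * math.sqrt(2) - 4) * 12.0 / (4 * math.sqrt(2) * 1.0)
print(round(threshold, 2))                  # 0.51
print(in_E(4, 4, 0.0), in_E(3, 3, 0.0))     # True False
print(in_E(3, 3, 0.50), in_E(3, 3, 0.52))   # False True
```

Evaluating `in_E` on a grid of targets for several values of `a0_max` gives a rough picture of how the sustainable set grows with the maximum intervention level.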